Search CORE

22 research outputs found

motilitAI: a machine learning framework for automatic prediction of human sperm motility

Author: Amiriparian Shahin
Gerczuk Maurice
Ottl Sandra
Schuller Björn W.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2022

In this article, human semen samples from the Visem dataset are automatically assessed with machine learning methods for their quality with respect to sperm motility. Several regression models are trained to automatically predict the percentage (0–100) of progressive, non-progressive, and immotile spermatozoa. The videos are used for unsupervised tracking and two different feature extraction methods, namely custom movement statistics and displacement features. We train multiple neural networks and support vector regression models on the extracted features. The best results are achieved using a linear Support Vector Regressor with an aggregated and quantized representation of the individual displacement features of each sperm cell. Compared to the best submission of the Medico Multimedia for Medicine challenge, which used the same dataset and splits, the mean absolute error (MAE) is reduced from 8.83 to 7.31. The source code for our experiments is available on GitHub (https://github.com/EIHW/motilitAI).
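
The abstract above reports results as mean absolute error (MAE) over predicted motility percentages. As a minimal sketch of that metric, the snippet below averages the absolute differences between target and predicted values. The function name and the sample percentages are hypothetical and purely illustrative; they are not taken from the paper or the dataset, and the paper's exact averaging scheme across samples and targets is not stated here.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between paired target and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical percentages for a single sample, in the order
# (progressive, non-progressive, immotile). Illustrative values only.
target = [50.0, 20.0, 30.0]
predicted = [44.0, 25.0, 31.0]

mae = mean_absolute_error(target, predicted)
print(f"MAE: {mae:.2f}")
```

A lower value means the predicted percentages lie closer to the annotated ones on average.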

OPUS Augsburg

PubMed Central

Sentiment analysis using image-based deep spectrum features

Author: Amiriparian Shahin
Cummins Nicholas
Gerczuk Maurice
Ottl Sandra
Schuller Björn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

OPUS Augsburg

Audio-based eating analysis and tracking utilising deep spectrum features

Author: Amiriparian Shahin
Gerczuk Maurice
Ottl Sandra
Pugachevskiy Sergey
Schuller Björn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

OPUS Augsburg

Crossref

Bag-of-deep-features: noise-robust deep feature representations for audio analysis

Author: Amiriparian Shahin
Cummins Nicholas
Gerczuk Maurice
Ottl Sandra
Pugachevskiy Sergey
Schuller Björn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

OPUS Augsburg

Crossref

Multimodal bag-of-words for cross domains sentiment analysis

Author: Amiriparian Shahin
Cummins Nicholas
Gerczuk Maurice
Ottl Sandra
Schmitt Maximilian
Schuller Björn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

OPUS Augsburg

Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks

Author: Amiriparian Shahin
Baird Alice
Gerczuk Maurice
Koebe Lukas
Ottl Sandra
Schuller Björn
Stappen Lukas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

OPUS Augsburg

Audio-based recognition of bipolar disorder utilising capsule networks

Author: Amiriparian Shahin
Awad Arsany
Baird Alice
Gerczuk Maurice
Ottl Sandra
Schuller Björn
Stappen Lukas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

OPUS Augsburg

Crossref

A machine learning framework for automatic prediction of human semen motility

Author: Amiriparian Shahin
Gerczuk Maurice
Ottl Sandra
Schuller Björn W.
Publication date: 23/12/2021

OPUS Augsburg

EmoNet: a transfer learning framework for multi-corpus speech emotion recognition

Author: Amiriparian Shahin
Gerczuk Maurice
Ottl Sandra
Schuller Björn W.
Publication date: 10/03/2021

In this manuscript, the topic of multi-corpus Speech Emotion Recognition (SER) is approached from a deep transfer learning perspective. A large corpus of emotional speech data, EmoSet, is assembled from a number of existing SER corpora. In total, EmoSet contains 84181 audio recordings from 26 SER corpora with a total duration of over 65 hours. The corpus is then utilised to create a novel framework for multi-corpus speech emotion recognition, namely EmoNet. A combination of a deep ResNet architecture and residual adapters is transferred from the field of multi-domain visual recognition to multi-corpus SER on EmoSet. Compared against two suitable baselines and more traditional training and transfer settings for the ResNet, the residual adapter approach enables parameter efficient training of a multi-domain SER model on all 26 corpora. A shared model with only 3.5 times the number of parameters of a model trained on a single database leads to increased performance for 21 of the 26 corpora in EmoSet. Measured by McNemar's test, these improvements are further significant for ten datasets at p < 0.05, while there are just two corpora that see only significant decreases across the residual adapter transfer experiments. Finally, we make our EmoNet framework publicly available for users and developers at https://github.com/EIHW/EmoNet. EmoNet provides an extensive command line interface which is comprehensively documented and can be used in a variety of multi-corpus transfer learning settings. Comment: 18 pages, 7 figures
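
The abstract above assesses significance with McNemar's test, which compares two classifiers on the same test items using only the items on which they disagree. The sketch below shows the exact (binomial) two-sided variant. This is an assumption: the abstract does not say which variant of the test was used. The function name and the counts `b` and `c` are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: items the first model classified correctly and the second did not
    c: items the second model classified correctly and the first did not
    Under the null hypothesis, each discordant item is equally likely
    to fall into either group (a binomial with p = 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    # Probability of observing k or fewer in the smaller group.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(0, 10)
print(f"p = {p_value:.6f}")
```

A p-value below the chosen threshold (0.05 in the abstract) indicates that the difference between the two models' predictions is unlikely to be due to chance alone.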

arXiv.org e-Print Archive

OPUS Augsburg